94 research outputs found

    Scikit-Multiflow: A Multi-output Streaming Framework

    Scikit-multiflow is a multi-output/multi-label and stream data mining framework for the Python programming language. Conceived as a platform to encourage the democratization of stream learning research, it provides multiple state-of-the-art methods for stream learning, stream generators and evaluators. scikit-multiflow builds upon popular open source frameworks including scikit-learn, MOA and MEKA. Development follows the FOSS principles, and quality is enforced by complying with PEP8 guidelines and using continuous integration and automatic testing. The source code is publicly available at https://github.com/scikit-multiflow/scikit-multiflow. Comment: 5 pages, Open Source Software

    Evolution-Based Online Automated Machine Learning

    Automated Machine Learning (AutoML) deals with finding well-performing machine learning models and their corresponding configurations without the need for machine learning experts. However, if one assumes an online learning scenario, where an AutoML instance executes on evolving data streams, the question of the best model and its configuration with respect to occurring changes in the data distribution remains open. Algorithms developed for online learning settings rely on few and homogeneous models and do not consider data mining pipelines or the adaptation of their configuration. We therefore introduce EvoAutoML, an evolution-based online learning framework consisting of heterogeneous and connectable models that supports large and diverse configuration spaces and adapts to the online learning scenario. We present experiments with an implementation of EvoAutoML on a diverse set of synthetic and real datasets, and show that our proposed approach outperforms state-of-the-art online algorithms as well as strong ensemble baselines in a traditional test-then-train evaluation.

    On ensemble techniques for data stream regression

    An ensemble of learners tends to exceed the predictive performance of individual learners. This approach has been explored for both batch and online learning. Ensemble methods applied to data stream classification have been thoroughly investigated over the years, while their regression counterparts have received less attention. In this work, we discuss and analyze several techniques for generating, aggregating, and updating ensembles of regressors for evolving data streams. We investigate the impact of different strategies for inducing diversity into the ensemble by randomizing the input data (resampling, random subspaces and random patches). In addition, we devote particular attention to techniques that adapt the ensemble model in response to concept drifts, including adaptive window approaches, fixed periodical resets and randomly determined windows. Extensive empirical experiments show that simple techniques can obtain similar predictive performance to sophisticated algorithms that rely on reactive adaptation (i.e., concept drift detection and recovery).
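Two of the ideas named in this abstract, diversity through resampling and fixed periodical resets, can be sketched in plain Python. The sketch assumes Poisson(1) online bagging as the resampling scheme and a round-robin reset policy; both choices, along with the class names `SGDRegressor` and `BaggedRegressor`, are illustrative assumptions rather than the authors' exact methods.

```python
import math
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class SGDRegressor:
    """Tiny linear model trained by stochastic gradient descent (hypothetical base learner)."""

    def __init__(self, n_features, lr=0.05):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_one(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def learn_one(self, x, y):
        error = self.predict_one(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


class BaggedRegressor:
    """Online bagging for regression with optional fixed periodic resets."""

    def __init__(self, n_models, n_features, reset_every=None, seed=0):
        self.rng = random.Random(seed)
        self.n_features = n_features
        self.reset_every = reset_every
        self.models = [SGDRegressor(n_features) for _ in range(n_models)]
        self.seen = 0

    def predict_one(self, x):
        # Aggregate member predictions with a plain mean.
        return sum(m.predict_one(x) for m in self.models) / len(self.models)

    def learn_one(self, x, y):
        # Fixed periodic reset (assumed policy): every `reset_every` examples,
        # replace one member in round-robin order before learning continues.
        if self.reset_every and self.seen > 0 and self.seen % self.reset_every == 0:
            idx = (self.seen // self.reset_every - 1) % len(self.models)
            self.models[idx] = SGDRegressor(self.n_features)
        self.seen += 1
        for m in self.models:
            # Resampling: each member sees the example k ~ Poisson(1) times.
            for _ in range(poisson(1.0, self.rng)):
                m.learn_one(x, y)


# Assumed noise-free synthetic stream: y = 2x + 1 with x drawn uniformly from [0, 1).
rng = random.Random(42)
ensemble = BaggedRegressor(n_models=5, n_features=1, reset_every=1000, seed=1)
for _ in range(3000):
    x = (rng.random(),)
    ensemble.learn_one(x, 2.0 * x[0] + 1.0)
prediction = ensemble.predict_one((0.5,))
```

Resetting members on a schedule needs no drift detector, which is what makes it one of the simple techniques the abstract contrasts with reactive adaptation.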

    Adaptive XGBoost for evolving data streams

    Boosting is an ensemble method that combines base models in a sequential manner to achieve high predictive accuracy. A popular learning algorithm based on this ensemble method is eXtreme Gradient Boosting (XGB). We present an adaptation of XGB for classification of evolving data streams. In this setting, new data arrives over time and the relationship between the class and the features may change in the process, thus exhibiting concept drift. The proposed method creates new members of the ensemble from mini-batches of data as new data becomes available. The maximum ensemble size is fixed, but learning does not stop when this size is reached because the ensemble is updated on new data to ensure consistency with the current concept. We also explore the use of concept drift detection to trigger a mechanism to update the ensemble. We test our method on real and synthetic data with concept drift and compare it against batch-incremental and instance-incremental classification methods for data streams.
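The general pattern of building ensemble members from mini-batches while keeping the ensemble size fixed can be sketched as below. This is not gradient boosting and not the authors' method: it assumes a toy nearest-centroid member, majority voting, and a drop-the-oldest replacement rule, and all class names are hypothetical.

```python
from collections import Counter, deque


class CentroidClassifier:
    """Toy batch learner (hypothetical): one mean feature vector per class."""

    def __init__(self, batch):
        sums, counts = {}, Counter()
        for x, y in batch:
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, xi in enumerate(x):
                acc[i] += xi
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda y: sq_dist(self.centroids[y]))


class FixedSizeBatchEnsemble:
    """Batch-incremental ensemble with a fixed maximum number of members."""

    def __init__(self, max_members=3, batch_size=20):
        # A bounded deque drops the oldest member when a new one is appended
        # at capacity (assumed replacement rule).
        self.members = deque(maxlen=max_members)
        self.batch_size = batch_size
        self.buffer = []

    def learn_one(self, x, y):
        self.buffer.append((x, y))
        if len(self.buffer) == self.batch_size:
            # A full mini-batch yields one new ensemble member.
            self.members.append(CentroidClassifier(self.buffer))
            self.buffer = []

    def predict_one(self, x):
        if not self.members:
            return None
        votes = Counter(m.predict_one(x) for m in self.members)
        return votes.most_common(1)[0][0]


def make_stream(n, flipped):
    """Assumed synthetic concept: label 1 when x > 0.5, inverted after the drift."""
    for i in range(n):
        x = (i % 10) / 10 + 0.05
        label = int(x > 0.5)
        yield (x,), (1 - label if flipped else label)


ens = FixedSizeBatchEnsemble(max_members=3, batch_size=20)
for x, y in make_stream(100, flipped=False):
    ens.learn_one(x, y)
before_drift = ens.predict_one((0.9,))
for x, y in make_stream(100, flipped=True):
    ens.learn_one(x, y)
after_drift = ens.predict_one((0.9,))
```

Once enough post-drift mini-batches have arrived, every member trained on the old concept has been replaced, so the vote follows the current concept even though the ensemble never grows beyond its fixed size.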

    River: Machine learning for streaming data in Python

    River is a machine learning library for dynamic data streams and continual learning. It provides multiple state-of-the-art learning methods, data generators/transformers, performance metrics and evaluators for different stream learning problems. It is the result of the merger of two popular packages for stream learning in Python: Creme and scikit-multiflow. River introduces a revamped architecture based on the lessons learnt from the seminal packages. River's ambition is to be the go-to library for doing machine learning on streaming data. Additionally, this open source package brings under the same umbrella a large community of practitioners and researchers. The source code is available at https://github.com/online-ml/river.

    Evidence for a narrow dip structure at 1.9 GeV/$c^2$ in $3\pi^+3\pi^-$ diffractive photoproduction

    A narrow dip structure has been observed at 1.9 GeV/$c^2$ in a study of diffractive photoproduction of the $3\pi^+3\pi^-$ final state performed by the Fermilab experiment E687. Comment: The data of Figure 6 can be obtained by downloading the raw data file e687_6pi.txt. v5 (2nov2018): added Fig. 7, the 6 pion energy distribution as requested by a reader

    Hyperoxemia and excess oxygen use in early acute respiratory distress syndrome : Insights from the LUNG SAFE study

    Publisher Copyright: © 2020 The Author(s). Copyright 2020 Elsevier B.V., All rights reserved. Background: Concerns exist regarding the prevalence and impact of unnecessary oxygen use in patients with acute respiratory distress syndrome (ARDS). We examined this issue in patients with ARDS enrolled in the Large observational study to UNderstand the Global impact of Severe Acute respiratory FailurE (LUNG SAFE) study. Methods: In this secondary analysis of the LUNG SAFE study, we wished to determine the prevalence and the outcomes associated with hyperoxemia on day 1, sustained hyperoxemia, and excessive oxygen use in patients with early ARDS. Patients who fulfilled criteria of ARDS on day 1 and day 2 of acute hypoxemic respiratory failure were categorized based on the presence of hyperoxemia (PaO2 > 100 mmHg) on day 1, sustained (i.e., present on day 1 and day 2) hyperoxemia, or excessive oxygen use (FIO2 ≥ 0.60 during hyperoxemia). Results: Of 2005 patients who met the inclusion criteria, 131 (6.5%) were hypoxemic (PaO2 < 55 mmHg), 607 (30%) had hyperoxemia on day 1, and 250 (12%) had sustained hyperoxemia. Excess FIO2 use occurred in 400 (66%) out of 607 patients with hyperoxemia. Excess FIO2 use decreased from day 1 to day 2 of ARDS, with most hyperoxemic patients on day 2 receiving relatively low FIO2. Multivariate analyses found no independent relationship between day 1 hyperoxemia, sustained hyperoxemia, or excess FIO2 use and adverse clinical outcomes. Mortality was 42% in patients with excess FIO2 use, compared to 39% in a propensity-matched sample of normoxemic (PaO2 55-100 mmHg) patients (P = 0.47). Conclusions: Hyperoxemia and excess oxygen use are both prevalent in early ARDS but are most often non-sustained. No relationship was found between hyperoxemia or excessive oxygen use and patient outcome in this cohort. Trial registration: LUNG-SAFE is registered with ClinicalTrials.gov, NCT02010073. Peer reviewed

    Geoeconomic variations in epidemiology, ventilation management, and outcomes in invasively ventilated intensive care unit patients without acute respiratory distress syndrome: a pooled analysis of four observational studies

    Background: Geoeconomic variations in epidemiology, the practice of ventilation, and outcome in invasively ventilated intensive care unit (ICU) patients without acute respiratory distress syndrome (ARDS) remain unexplored. In this analysis we aim to address these gaps using individual patient data of four large observational studies. Methods: In this pooled analysis we harmonised individual patient data from the ERICC, LUNG SAFE, PRoVENT, and PRoVENT-iMiC prospective observational studies, which were conducted from June, 2011, to December, 2018, in 534 ICUs in 54 countries. We used the 2016 World Bank classification to define two geoeconomic regions: middle-income countries (MICs) and high-income countries (HICs). ARDS was defined according to the Berlin criteria. Descriptive statistics were used to compare patients in MICs versus HICs. The primary outcome was the use of low tidal volume ventilation (LTVV) for the first 3 days of mechanical ventilation. Secondary outcomes were key ventilation parameters (tidal volume size, positive end-expiratory pressure, fraction of inspired oxygen, peak pressure, plateau pressure, driving pressure, and respiratory rate), patient characteristics, the risk for and actual development of acute respiratory distress syndrome after the first day of ventilation, duration of ventilation, ICU length of stay, and ICU mortality. Findings: Of the 7608 patients included in the original studies, this analysis included 3852 patients without ARDS, of whom 2345 were from MICs and 1507 were from HICs. Patients in MICs were younger, shorter and with a slightly lower body-mass index, more often had diabetes and active cancer, but less often chronic obstructive pulmonary disease and heart failure than patients from HICs. Sequential organ failure assessment scores were similar in MICs and HICs. Use of LTVV in MICs and HICs was comparable (42·4% vs 44·2%; absolute difference –1·69 [–9·58 to 6·11] p=0·67; data available in 3174 [82%] of 3852 patients). The median applied positive end-expiratory pressure was lower in MICs than in HICs (5 [IQR 5–8] vs 6 [5–8] cm H2O; p=0·0011). ICU mortality was higher in MICs than in HICs (30·5% vs 19·9%; p=0·0004; adjusted effect 16·41% [95% CI 9·52–23·52]; p<0·0001) and was inversely associated with gross domestic product (adjusted odds ratio for a US$10 000 increase per capita 0·80 [95% CI 0·75–0·86]; p<0·0001). Interpretation: Despite similar disease severity and ventilation management, ICU mortality in patients without ARDS is higher in MICs than in HICs, with a strong association with country-level economic status. Funding: No funding.